Abstract



The main operations in Inductive Logic Programming (ILP) are generalization and 
specialization, which only make sense in a generality order. In ILP, the three most impor- 
tant generality orders are subsumption, implication and implication relative to background 
knowledge. The two languages used most often are languages of clauses and languages of 
only Horn clauses. This gives a total of six different ordered languages. In this paper, we 
give a systematic treatment of the existence or non-existence of least generalizations and 
greatest specializations of finite sets of clauses in each of these six ordered sets. We survey 
results already obtained by others and also contribute some answers of our own. 

Our main new results are, firstly, the existence of a computable least generalization 
under implication of every finite set of clauses containing at least one non-tautologous 
function-free clause (among other, not necessarily function-free clauses). Secondly, we show 
that such a least generalization need not exist under relative implication, not even if both 
the set that is to be generalized and the background knowledge are function-free. Thirdly, 
we give a complete discussion of existence and non-existence of greatest specializations in 
each of the six ordered languages. 

1. Introduction 

Inductive Logic Programming (ILP) is a subfield of Logic Programming and Machine Learning
that tries to induce clausal theories from given sets of positive and negative examples.
An inductively inferred theory should imply all of the positive and none of the negative
examples. For instance, suppose we are given P(0), P(s^2(0)), P(s^4(0)), P(s^6(0))
as positive examples and P(s(0)), P(s^3(0)), P(s^5(0)) as negative examples. 1 Then the set
S = {P(0), (P(s^2(x)) <- P(x))} is a solution: it implies all positive and no negative
examples. Note that this set can be seen as a description of the even integers, learned from
these examples. Thus induction of clausal theories is a form of learning from examples. For
a more extensive introduction to ILP, we refer to (Lavrac & Dzeroski, 1994; Muggleton &
De Raedt, 1994).
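
The example can be checked mechanically. The sketch below is an illustration only, not part of the original text. It relies on the standard fact that a definite program implies a ground atom exactly when that atom belongs to its least Herbrand model, and it computes that model bottom-up up to a bound. Encoding the term s^n(0) as the integer n and the name `derivable_upto` are assumptions made here for brevity.

```python
def derivable_upto(bound):
    """Integers n <= bound such that P(s^n(0)) follows from
    S = {P(0), P(s^2(x)) <- P(x)}, computed by bottom-up evaluation."""
    derived = {0}                      # the unit clause P(0)
    changed = True
    while changed:
        changed = False
        for n in list(derived):
            m = n + 2                  # one application of P(s^2(x)) <- P(x)
            if m <= bound and m not in derived:
                derived.add(m)
                changed = True
    return derived


model = derivable_upto(6)
positives = [0, 2, 4, 6]
negatives = [1, 3, 5]
print(all(n in model for n in positives))   # True
print(any(n in model for n in negatives))   # False
```

The bound is only there to keep the computation finite; the least Herbrand model itself is the infinite set of even numerals.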

Learning from examples means modifying a theory to bring it more in accordance with 
the examples. The two main operations in ILP for modification of a theory are generalization 
and specialization. Generalization strengthens a theory that is too weak, while specialization 
weakens a theory that is too strong. These operations only make sense within a generality 
order. This is a relation stating when some clause is more general than some other clause. 

1. Here s^2(0) abbreviates s(s(0)), s^3(0) abbreviates s(s(s(0))), etc.



©1996 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved. 



Nienhuys-Cheng & de Wolf 



The three most important generality orders used in ILP are subsumption (also called
θ-subsumption), logical implication and implication relative to background knowledge. 2 In
the subsumption order, we say that clause C is more general than D (or, equivalently, D
is more specific than C) in case C subsumes D. In the implication order C is more general
than D if C logically implies D. Finally, C is more general than D relative to background
knowledge Σ (Σ is a set of clauses), if {C} ∪ Σ logically implies D.

Of these three orders, subsumption is the most tractable. In particular, subsumption
is decidable, whereas logical implication is not decidable, not even for Horn clauses, as
established by Marcinkowski and Pacholski (1992). In turn, relative implication is harder
than implication: both are undecidable, but proof procedures for implication need to take
only derivations from {C} into account, whereas a proof procedure for relative implication
should check all derivations from {C} ∪ Σ.

Within a generality order, there are two approaches to generalization or specialization. 
The first approach generalizes or specializes individual clauses. We do not discuss this in 
any detail in this paper, and merely mention it for completeness' sake. This approach can 
be traced back to Reynolds' (1970) concept of a cover. It was implemented for example 
by Shapiro (1981) in the subsumption order in the form of refinement operators. However, 
a clause C which implies another clause D need not subsume D. For instance, take C =
P(f(x)) <- P(x) and D = P(f^2(x)) <- P(x). Then C does not subsume D, but C ⊨ D.
Thus subsumption is weaker than implication. A further sign of this weakness is the fact 
that tautologies need not be subsume-equivalent, even though they are logically equivalent. 

The second approach generalizes or specializes sets of clauses. This is the approach 
we will be concerned with in this paper. Here the concept of a least generalization 3 is 
important. The use of such least generalizations allows us to generalize cautiously, avoiding 
over-generalization. Least generalizations of sets of clauses were first discussed by Plotkin 
(1970, 1971a, 1971b). He proved that any finite set S of clauses has a least generalization 
under subsumption (LGS). This is a clause which subsumes all clauses in S and which is 
subsumed by all other clauses that also subsume all clauses in S. Positive examples can 
be generalized by taking their LGS. 4 Of course, we need not take the LGS of all positive 
examples, which would yield a theory consisting of only one clause. Instead, we might 
divide the positive examples into subsets, and take a separate LGS of each subset. That 
way we obtain a theory containing more than one clause. 

For this second approach, subsumption is again not fully satisfactory. For example, if S 
consists of the clauses D_1 = P(f^2(a)) <- P(a) and D_2 = P(f(b)) <- P(b), then the LGS of
S is P(f(y)) <- P(x). The clause P(f(x)) <- P(x), which seems more appropriate as a least
generalization of S, cannot be found by Plotkin's approach, because it does not subsume
D_1. As this example also shows, the subsumption order is particularly unsatisfactory when
we consider recursive clauses: clauses which can be resolved with themselves. 
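
The LGS in this example can be reproduced with a small anti-unification routine in the style of Plotkin's construction. What follows is a minimal sketch under stated assumptions, not the authors' implementation. The term encoding, the function names `lgg_term` and `lgs`, and the generated variable names V1, V2 are all choices made for this illustration. The example takes D_1 = P(f(f(a))) <- P(a), the reading under which the stated LGS P(f(y)) <- P(x) comes out. The final reduction step that removes redundant literals is omitted.

```python
from itertools import count

# Encoding (an assumption of this sketch): a variable is a str, a constant or
# function application is a tuple (symbol, arg1, ..., argn), and a literal is
# (is_positive, predicate, args). A clause is a list of literals.

def lgg_term(s, t, table, fresh):
    """Least general generalization of two terms (anti-unification)."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        args = (lgg_term(a, b, table, fresh) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + tuple(args)
    # Differing subterms: reuse one variable per distinct pair (s, t).
    if (s, t) not in table:
        table[(s, t)] = f"V{next(fresh)}"
    return table[(s, t)]


def lgs(c, d):
    """Generalize every compatible pair of literals, sharing one table."""
    table, fresh = {}, count(1)
    result = set()
    for sign1, pred1, args1 in c:
        for sign2, pred2, args2 in d:
            if sign1 == sign2 and pred1 == pred2 and len(args1) == len(args2):
                args = tuple(lgg_term(a, b, table, fresh)
                             for a, b in zip(args1, args2))
                result.add((sign1, pred1, args))
    return result


a, b = ("a",), ("b",)
d1 = [(True, "P", (("f", ("f", a)),)), (False, "P", (a,))]   # P(f(f(a))) <- P(a)
d2 = [(True, "P", (("f", b),)), (False, "P", (b,))]          # P(f(b)) <- P(b)
print(sorted(lgs(d1, d2)))
# [(False, 'P', ('V2',)), (True, 'P', (('f', 'V1'),))]
```

The two variables differ because the pair (f(a), b) in the heads is not the same pair as (a, b) in the bodies, which is exactly why the result is P(f(y)) <- P(x) rather than P(f(x)) <- P(x).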



2. There is also relative subsumption (Plotkin, 1971b), which will be touched on briefly in Section 4.

3. Least generalizations are also often called least general generalizations, for instance by Plotkin (1971b), 
Muggleton and Page (1994), Idestam-Almquist (1993, 1995), Niblett (1988), though not by Plotkin 
(1970), but we feel this 'general' is redundant. 

4. There is also a relation between least generalization under subsumption and inverse resolution (Muggle- 
ton, 1992). 
Because of the weakness of subsumption, it is desirable to make the step from the
subsumption order to the more powerful implication order. Accordingly, it is important to
find out whether Plotkin's positive result on the existence of LGS's holds for implication
as well. So far, the question whether any finite set of clauses has a least generalization
under implication (LGI) has only been partly answered. For instance, Idestam-Almquist
(1993, 1995) studies least generalizations under T-implication as an approximation to LGI's.
Muggleton and Page (1994) investigate self-saturated clauses. A clause is self-saturated if it
is subsumed by any clause which implies it. A clause D is a self-saturation of C if C and D
are logically equivalent and D is self-saturated. As Muggleton and Page (1994) state, if two
clauses C_1 and C_2 have self-saturations D_1 and D_2, then an LGS of D_1 and D_2 is also an
LGI of C_1 and C_2. This positively answers our question concerning the existence of LGI's
in the case of clauses which have a self-saturation. However, Muggleton and Page also show
that there exist clauses which have no self-saturation. Hence the concept of self-saturation
cannot solve our question in general.

Use of the third generality order, relative implication, is even more desirable than the 
use of "plain" implication. Relative implication allows us to take background knowledge 
into account, which can be used to formalize many useful properties and relations of the 
domain of application. For this reason, least generalizations under implication relative to 
background knowledge also deserve attention. 

Apart from the least generalization, there is also its dual: the greatest specialization. 
Greatest specializations have been accorded much less attention in ILP than least gener- 
alizations, but the concept of a greatest specialization may nevertheless be useful (see the 
beginning of Section 6). 

In this paper, we give a systematic treatment of the existence and non-existence of 
least generalizations and greatest specializations, applied to each of these three generality 
orders. Apart from distinguishing between these three orders, we also distinguish between 
languages of general clauses and more restricted languages of Horn clauses. Though most 
researchers in ILP restrict attention to Horn clauses, general clauses are also sometimes used 
(Plotkin, 1970, 1971b; Shapiro, 1981; De Raedt & Bruynooghe, 1993; Idestam-Almquist, 
1993, 1995). Moreover, many researchers who do not use general clauses actually allow 
negative literals to appear in the body of a clause. That is, they use clauses of the form 
A <- L_1, ..., L_n, where A is an atom and each L_i is a literal. These are called program
clauses (Lloyd, 1987). Program clauses are in fact logically equivalent to general clauses.
For instance, the program clause P(x) <- Q(x), ¬R(x) is equivalent to the non-Horn clause
P(x) ∨ ¬Q(x) ∨ R(x). For these two reasons we consider not only languages of Horn clauses,
but also pay attention to languages of general clauses. 

The combination of three generality orders and two different possible languages of clauses 
gives a total of six different ordered languages. For each of these, we can ask whether least 
generalizations (LG's) and greatest specializations (GS's) always exist. We survey results 
already obtained by others and also contribute some answers of our own. For the sake of 
clarity, we will summarize the results of our survey right at the outset. In the following 
table '+' signifies a positive answer, and '-' means a negative answer.
Quasi-order                    Horn clauses      General clauses
                               LG      GS        LG                      GS
Subsumption (⪰)                +       +         +                       +
Implication (⊨)                -       -         + for function-free     +
Relative implication (⊨_Σ)     -       -         -                       +

Table 1: Existence of LG's and GS's



Our own contributions to this table are threefold. First and foremost, we prove that if S 
is a finite set of clauses containing at least one non-tautologous function-free clause 5 (apart 
from this non-tautologous function-free clause, S may contain an arbitrary finite number 
of other clauses, including clauses which contain functions), then there exists a computable 
LGI of S. This result is on the one hand based on the Subsumption Theorem for resolution 
(Lee, 1967; Kowalski, 1970; Nienhuys-Cheng & de Wolf, 1996), which allows us to restrict 
attention to finite sets of ground instances of clauses, and on the other hand on a modifi- 
cation of some proofs concerning T-implication which can be found in (Idestam-Almquist, 
1993, 1995). An immediate corollary of this result is the existence and computability of an 
LGI of any finite set of function-free clauses. As far as we know, both our general LGI-result 
and this particular corollary are new results. 

Niblett (1988, p. 135) claims that "it is simple to show that there are lggs if the language 
is restricted to a fixed set of constant symbols since all Herbrand interpretations are finite." 
Yet even for this special case of our general result, it appears that no proof has been 
published. Initially, we found a direct proof of this case, but this was not really any simpler 
than the proof of the more general result that we give in this paper. Niblett's idea that the 
proof is simple may be due to some confusion about the relation between Herbrand models 
and logical implication (which is defined in terms of all models, not just Herbrand models). 
We will describe this at the end of Subsection 5.1. Or perhaps one might think that the 
decidability of implication for function-free clauses immediately implies the existence of 
an LGI. But in fact, decidability is not a sufficient condition for the existence of a least 
generalization. For example, it is decidable whether one function-free clause C implies 
another function-free clause D relative to function-free background knowledge. Yet least 
generalizations relative to function-free background knowledge do not always exist, as we 
will show in Section 7. 

Our LGI-result does not solve the general question of the existence of LGI's, but it does 
provide a positive answer for a large class of cases: the presence of one non-tautologous 
function-free clause in a finite S already guarantees the existence and computability of an 
LGI of S, no matter what other clauses S contains. 6 Because of the prominence of function-free
clauses in ILP, this case may be of great practical significance. Often, particularly in
implementations of ILP-systems, the language is required to be function-free, or function 



5. A clause which only contains constants and variables as terms. 

6. Note that even for function-free clauses, the subsumption order is still not enough. Consider D_1 =
P(x,y,z) <- P(y,z,x) and D_2 = P(x,y,z) <- P(z,x,y) (this example is adapted from Idestam-Almquist).
D_1 is a resolvent of D_2 and D_2, and D_2 is a resolvent of D_1 and D_1. Hence D_1 and D_2
are logically equivalent. This means that D_1 is an LGI of the set {D_1, D_2}. However, the LGS of these
two clauses is P(x,y,z) <- P(u,v,w), which is clearly an over-generalization.
symbols are removed from clauses and put in the background knowledge by techniques
such as flattening (Rouveirol, 1992). Well-known ILP-systems such as Foil (Quinlan &
Cameron-Jones, 1993), Linus (Lavrac & Dzeroski, 1994) and Mobal (Morik, Wrobel,
Kietz, & Emde, 1993) all use only function-free clauses. More than half of the ILP-systems
surveyed by Aha (1992) are restricted to function-free clauses. Function-free clauses
are also sufficient for most applications concerning databases.

Our second contribution shows that a set S need not have a least generalization relative
to some background knowledge Σ, not even when S and Σ are both function-free.

Thirdly, we contribute a complete discussion of existence and non-existence of greatest 
specializations in each of the six ordered languages. In particular, we show that any finite 
set of clauses has a greatest specialization under implication. Combining this with the 
corollary of our result on LGI's, it follows that a function-free clausal language is a lattice.

2. Preliminaries 

In this section we will define some of the concepts we need. For the definitions of 'model', 
'tautology', 'substitution', etc., we refer to standard works such as (Chang & Lee, 1973; 
Lloyd, 1987). A positive literal is an atom, a negative literal is the negation of an atom. 
A clause is a finite set of literals, which is treated as the universally quantified disjunction 
of those literals. A definite program clause is a clause with one positive and zero or more 
negative literals and a definite goal is a clause without positive literals. A Horn clause is 
either a definite program clause or a definite goal. If C is a clause, we use C^+ to denote
the positive literals in C, and C^- to denote the negative literals in C. The empty clause,
which represents a contradiction, is denoted by □.

Definition 1 Let A be an alphabet of the first-order logic. Then the clausal language 𝒞
by A is the set of all clauses which can be constructed from the symbols in A. The Horn
language ℋ by A is the set of all Horn clauses which can be constructed from the symbols
in A. □

In this paper, we just presuppose some arbitrary alphabet A, and consider the clausal
language 𝒞 and Horn language ℋ based on this A. We will now define three increasingly
strong generality orders on clauses: subsumption, implication and relative implication. 

Definition 2 Let C and D be clauses and Σ be a set of clauses. We say that C subsumes
D, denoted as C ⪰ D, if there exists a substitution θ such that Cθ ⊆ D. 7 C and D are
subsume-equivalent if C ⪰ D and D ⪰ C.

Σ (logically) implies C, denoted as Σ ⊨ C, if every model of Σ is also a model of C. C
(logically) implies D, denoted as C ⊨ D, if {C} ⊨ D. C and D are (logically) equivalent if
C ⊨ D and D ⊨ C.

C implies D relative to Σ, denoted as C ⊨_Σ D, if Σ ∪ {C} ⊨ D. C and D are equivalent
relative to Σ if C ⊨_Σ D and D ⊨_Σ C. □
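
The subsumption part of this definition translates directly into a search for a suitable substitution. The following is a minimal sketch, assuming the same illustrative encoding as elsewhere (variables as strings, other terms as tuples, literals as (is_positive, predicate, args) triples). The names `match` and `subsumes` are hypothetical. The search is naive backtracking and makes no attempt at efficiency.

```python
def match(pattern, target, theta):
    """Extend theta so that pattern instantiated by theta equals target,
    or return None. Variables of the target are treated as fixed symbols."""
    if isinstance(pattern, str):                      # a variable of C
        if pattern in theta:
            return theta if theta[pattern] == target else None
        return {**theta, pattern: target}
    if (isinstance(target, tuple) and pattern[0] == target[0]
            and len(pattern) == len(target)):
        for p, t in zip(pattern[1:], target[1:]):
            theta = match(p, t, theta)
            if theta is None:
                return None
        return theta
    return None


def subsumes(c, d, theta=None):
    """True if some substitution maps every literal of c onto a literal of d."""
    theta = {} if theta is None else theta
    if not c:
        return True
    (sign, pred, args), rest = c[0], c[1:]
    for sign2, pred2, args2 in d:
        if sign == sign2 and pred == pred2 and len(args) == len(args2):
            extended = theta
            for p, t in zip(args, args2):
                extended = match(p, t, extended)
                if extended is None:
                    break
            else:
                if subsumes(rest, d, extended):
                    return True
    return False


c = [(True, "P", (("f", "x"),)), (False, "P", ("x",))]          # P(f(x)) <- P(x)
d = [(True, "P", (("f", ("f", "x")),)), (False, "P", ("x",))]   # P(f(f(x))) <- P(x)
g = [(True, "P", (("f", "y"),)), (False, "P", ("x",))]          # P(f(y)) <- P(x)
print(subsumes(c, d))   # False
print(subsumes(c, c))   # True
print(subsumes(g, d))   # True
```

The first call fails because the head forces x to be bound to f(x), after which the body literal no longer matches. This is the introduction's example of a clause that implies another without subsuming it.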

7. Right from the very first applications of subsumption in ILP, there has been some controversy about
the symbol used for subsumption: Plotkin (1970) used '<', while Reynolds (1970) used '>'. We use '⪰'
here, similar to Reynolds' '>', because we feel it serves the intuition to view C as somehow "bigger" or
"stronger" than D, if C ⪰ D holds.
If C does not subsume D, we write C ⋡ D. Similarly, we use the notation C ⊭ D and
C ⊭_Σ D.

If C ⪰ D, then C ⊨ D. The converse does not hold, as the examples in the introduction
showed. Similarly, if C ⊨ D, then C ⊨_Σ D, and again the converse need not hold. Consider
C = P(a) ∨ ¬P(b), D = P(a), and Σ = {P(b)}: then C ⊨_Σ D, but C ⊭ D.
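
Because all clauses in this last example are ground, the two claims can be verified by a truth-table check: for ground clauses, logical implication coincides with propositional entailment when the ground atoms are treated as propositional variables. The sketch below is an illustration under that assumption. The function names and the encoding of a ground literal as a (sign, atom-name) pair are choices made here.

```python
from itertools import product

def holds(clause, world):
    """A ground clause is true if at least one of its literals is true."""
    return any(world[atom] == sign for sign, atom in clause)


def entails(premises, conclusion):
    """Propositional entailment by enumerating all truth assignments."""
    atoms = sorted({atom for cl in premises + [conclusion] for _, atom in cl})
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(holds(cl, world) for cl in premises) and not holds(conclusion, world):
            return False
    return True


C = [(True, "P(a)"), (False, "P(b)")]    # P(a) or not P(b)
D = [(True, "P(a)")]                     # P(a)
sigma = [[(True, "P(b)")]]               # background knowledge {P(b)}
print(entails(sigma + [C], D))   # True:  C implies D relative to sigma
print(entails([C], D))           # False: C alone does not imply D
```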

We now proceed to define a proof procedure for logical implication between clauses, 
using resolution and subsumption. 

Definition 3 If two clauses have no variables in common, then they are said to be stan- 
dardized apart. 

Let C_1 = L_1 ∨ ... ∨ L_i ∨ ... ∨ L_m and C_2 = M_1 ∨ ... ∨ M_j ∨ ... ∨ M_n be two clauses
which are standardized apart. If the substitution θ is a most general unifier (mgu) of the
set {L_i, ¬M_j}, then the clause ((C_1 - L_i) ∪ (C_2 - M_j))θ is called a binary resolvent of C_1
and C_2. The literals L_i and M_j are said to be the literals resolved upon. □

If C_1 and C_2 are not standardized apart, we can take a variant C_2' of C_2, such that C_1 and
C_2' are standardized apart. For simplicity, a binary resolvent of C_1 and C_2' is also called a
binary resolvent of C_1 and C_2 itself.
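
Binary resolution as just defined can be sketched in a few lines. The code below is illustrative only. It assumes the tuple encoding of terms used in the other sketches, standardizes the second clause apart by appending a prime to its variable names (which presumes the first clause contains no primed variables), and uses a textbook unification routine with occurs check. All function names are hypothetical. Factoring is not included.

```python
def walk(t, s):
    """Follow variable bindings in the substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s):
    """Return an mgu extending s (as a chain of bindings), or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def apply(t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(a, s) for a in t[1:])

def rename(clause, suffix):
    def r(t):
        if isinstance(t, str):
            return t + suffix
        return (t[0],) + tuple(r(a) for a in t[1:])
    return [(sign, p, tuple(r(a) for a in args)) for sign, p, args in clause]

def binary_resolvents(c1, c2):
    """All binary resolvents of two clauses given as lists of literals."""
    c2 = rename(c2, "'")                       # standardize apart
    found = []
    for i, (s1, p1, a1) in enumerate(c1):
        for j, (s2, p2, a2) in enumerate(c2):
            if s1 == s2 or p1 != p2 or len(a1) != len(a2):
                continue                       # need complementary literals
            theta = unify(("#",) + a1, ("#",) + a2, {})
            if theta is None:
                continue
            rest = c1[:i] + c1[i + 1:] + c2[:j] + c2[j + 1:]
            found.append(frozenset(
                (sign, p, tuple(apply(t, theta) for t in args))
                for sign, p, args in rest))
    return found


c = [(True, "P", (("f", "x"),)), (False, "P", ("x",))]              # P(f(x)) <- P(x)
d = frozenset({(True, "P", (("f", ("f", "x")),)), (False, "P", ("x",))})
print(d in binary_resolvents(c, c))   # True
```

This reproduces the introduction's observation that C = P(f(x)) <- P(x) resolved with itself yields D = P(f^2(x)) <- P(x), so C implies D without subsuming it.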

Definition 4 Let C be a clause and θ an mgu of {L_1, ..., L_n} ⊆ C (n ≥ 1). Then the
clause Cθ is called a factor of C. □

Note that any non-empty clause C is a factor of itself, using the empty substitution ε as an
mgu of a single literal in C.

Definition 5 A resolvent C of clauses C_1 and C_2 is a binary resolvent of a factor of C_1
and a factor of C_2, where the literals resolved upon are the literals unified in the respective
factors. C_1 and C_2 are the parent clauses of C. □

Definition 6 Let Σ be a set of clauses and C a clause. A derivation of C from Σ is a finite
sequence of clauses R_1, ..., R_k = C, such that each R_i is either in Σ, or a resolvent of two
clauses in {R_1, ..., R_{i-1}}. If such a derivation exists, we write Σ ⊢_r C. □

Definition 7 Let Σ be a set of clauses and C a clause. We say there exists a deduction of
C from Σ, written as Σ ⊢_d C, if C is a tautology, or if there exists a clause D such that
Σ ⊢_r D and D ⪰ C. □

The next result, proved by Nienhuys-Cheng and de Wolf (1996), generalizes Herbrand's 
Theorem: 

Theorem 1 Let Σ be a set of clauses and C be a ground clause. If Σ ⊨ C, then there
exists a finite set Σ_g of ground instances of clauses in Σ, such that Σ_g ⊨ C.

The following Subsumption Theorem gives a precise characterization of implication between 
clauses in terms of resolution and subsumption. It was proved by Lee (1967), Kowalski 
(1970) and reproved by Nienhuys-Cheng and de Wolf (1996). 
Theorem 2 (Subsumption theorem) Let Σ be a set of clauses and C be a clause. Then
Σ ⊨ C iff Σ ⊢_d C.

The next lemma was first proved by Gottlob (1987). Actually, it is an immediate corollary 
of the subsumption theorem: 

Lemma 1 (Gottlob) Let C and D be non-tautologous clauses. If C ⊨ D, then C^+ ⪰ D^+
and C^- ⪰ D^-.

Proof Since C^+ ⪰ C, if C ⊨ D, then we have C^+ ⊨ D. Since C^+ cannot be resolved with
itself, it follows from the subsumption theorem that C^+ ⪰ D. But then C^+ must subsume
the positive literals in D, hence C^+ ⪰ D^+. Similarly C^- ⪰ D^-. □

An important consequence of this lemma concerns the depth of clauses, defined as follows: 

Definition 8 Let t be a term. If t is a variable or constant, then the depth of t is 1. If
t = f(t_1, ..., t_n), n ≥ 1, then the depth of t is 1 plus the depth of the t_i with largest depth.
The depth of a clause C is the depth of the term with largest depth in C. □

For example, the term t = f(a, x) has depth 2. C = P(f(x)) <- P(g(f(x), a)) has depth 3,
since g(f(x), a) has depth 3. It follows from Gottlob's lemma that if C ⊨ D, then the depth
of C is smaller than or equal to the depth of D, for otherwise C^+ cannot subsume D^+ or
C^- cannot subsume D^-. For instance, take D = P(x, f(x, g(y))) <- P(g(a), b), which has
depth 3. Then a clause C containing a term f(x, g^2(y)) (depth 4) cannot imply D.
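
The depth measure is easy to compute recursively. The sketch below simply restates Definition 8 in code, with the illustrative tuple encoding of terms assumed throughout these examples. Returning 0 for a clause that contains no terms at all is a convention chosen here, since the definition does not cover that case.

```python
def depth(term):
    """Depth of a term: 1 for variables and constants, else 1 + max over args."""
    if isinstance(term, str) or len(term) == 1:     # variable or constant
        return 1
    return 1 + max(depth(arg) for arg in term[1:])


def clause_depth(clause):
    """Depth of the deepest term occurring in the clause (0 if there is none)."""
    return max((depth(t) for _, _, args in clause for t in args), default=0)


a, b = ("a",), ("b",)
print(depth(("f", a, "x")))                          # 2
c = [(True, "P", (("f", "x"),)),
     (False, "P", (("g", ("f", "x"), a),))]          # P(f(x)) <- P(g(f(x), a))
print(clause_depth(c))                               # 3
d = [(True, "P", ("x", ("f", "x", ("g", "y")))),
     (False, "P", (("g", a), b))]                    # P(x, f(x, g(y))) <- P(g(a), b)
print(clause_depth(d))                               # 3
print(depth(("f", "x", ("g", ("g", "y")))))          # 4
```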

Definition 9 Let S and S' be finite sets of clauses, x_1, ..., x_n all distinct variables
appearing in S, and a_1, ..., a_n distinct constants not appearing in S or S'. Then σ =
{x_1/a_1, ..., x_n/a_n} is called a Skolem substitution for S w.r.t. S'. If S' is empty, we just
say that σ is a Skolem substitution for S. □

Lemma 2 Let Σ be a set of clauses, C be a clause, and σ be a Skolem substitution for C
w.r.t. Σ. Then Σ ⊨ C iff Σ ⊨ Cσ.

Proof

⇒: Obvious.

⇐: Suppose C is not a tautology and let σ = {x_1/a_1, ..., x_n/a_n}. If Σ ⊨ Cσ, it follows
from the subsumption theorem that there is a D such that Σ ⊢_r D and D ⪰ Cσ. Thus there
is a θ such that Dθ ⊆ Cσ. Note that since Σ ⊢_r D and none of the constants a_1, ..., a_n
appears in Σ, none of these constants appears in D. Now let θ' be obtained by replacing in
θ all occurrences of a_i by x_i, for every 1 ≤ i ≤ n. Then Dθ' ⊆ C, hence D ⪰ C. Therefore
Σ ⊢_d C and hence Σ ⊨ C. □
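
A Skolem substitution in the sense of Definition 9 can be constructed mechanically by picking constant names that occur in neither set. The sketch below is one possible way to do this. The naming scheme sk1, sk2, ... and the function names are assumptions of the example, and terms use the same illustrative tuple encoding as before.

```python
def collect(term, variables, symbols):
    """Record the variables and the constant/function symbols of a term."""
    if isinstance(term, str):
        if term not in variables:
            variables.append(term)
    else:
        symbols.add(term[0])
        for arg in term[1:]:
            collect(arg, variables, symbols)


def skolem_substitution(clauses, other=()):
    """Map each variable of `clauses` to a distinct constant that is unused
    in `clauses` and in `other`."""
    variables, symbols = [], set()
    for clause in clauses:
        for _, _, args in clause:
            for term in args:
                collect(term, variables, symbols)
    for clause in other:
        for _, _, args in clause:
            for term in args:
                collect(term, [], symbols)
    sigma, n = {}, 0
    for var in variables:
        n += 1
        while f"sk{n}" in symbols:      # skip names that are already taken
            n += 1
        sigma[var] = (f"sk{n}",)
    return sigma


def substitute(term, sigma):
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0],) + tuple(substitute(a, sigma) for a in term[1:])


clause = [(True, "P", ("x", ("f", "y")))]              # P(x, f(y))
background = [[(True, "Q", (("sk1",),))]]              # Q(sk1) already uses sk1
sigma = skolem_substitution([clause], background)
print(sigma)                                           # {'x': ('sk2',), 'y': ('sk3',)}
print(substitute(("f", "y"), sigma))                   # ('f', ('sk3',))
```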
3. Least Generalizations and Greatest Specializations

In this section, we will define the concepts we need concerning least generalizations and 
greatest specializations. 

Definition 10 Let T be a set and R be a binary relation on T.

1. R is reflexive on T, if xRx for every x ∈ T.

2. R is transitive on T, if for every x, y, z ∈ T, xRy and yRz implies xRz.

3. R is symmetric on T, if for every x, y ∈ T, xRy implies yRx.

4. R is anti-symmetric on T, if for every x, y ∈ T, xRy and yRx implies x = y.

If R is both reflexive and transitive on T, we say R is a quasi-order on T. If R is
reflexive, transitive and anti-symmetric on T, we say R is a partial order on T. If R is
reflexive, transitive and symmetric on T, R is an equivalence relation on T. □

A quasi-order R on T induces an equivalence relation ~ on T, as follows: we say x, y ∈ T
are equivalent induced by R (denoted x ~ y) if both xRy and yRx. Using this equivalence
relation, a quasi-order R on T induces a partial order R' on the set of equivalence classes in
T, defined as follows: if [x] denotes the equivalence class of x (i.e., [x] = {y | x ~ y}), then
[x]R'[y] iff xRy.

We first give a general definition of least generalizations and greatest specializations for 
sets of clauses ordered by some quasi-order, which we then instantiate in different ways. 

Definition 11 Let T be a set of clauses, ≥ be a quasi-order on T, S ⊆ T be a finite set of
clauses and C ∈ T. If C ≥ D for every D ∈ S, then we say C is a generalization of S under
≥. Such a C is called a least generalization (LG) of S under ≥ in T, if we have C' ≥ C for
every generalization C' ∈ T of S under ≥.

Dually, C is a specialization of S under ≥, if D ≥ C for every D ∈ S. Such a C is called
a greatest specialization (GS) of S under ≥ in T, if we have C ≥ C' for every specialization
C' ∈ T of S under ≥. □

It is easy to see that if some set S has an LG or GS under ≥ in T, then this LG or GS will
be unique up to the equivalence induced by ≥ in T. That is, if C and D are both LG's or
GS's of some set S, then we have C ~ D.

The concepts defined above are instances of the mathematical concepts of (least) upper
bounds and (greatest) lower bounds. Thus we can speak of lattice-properties of a quasi- or
partially ordered set of clauses:

Definition 12 Let T be a set of clauses and ≥ be a quasi-order on T. If for every finite
subset S of T, there exist both a least generalization and a greatest specialization of S under
≥ in T, then the set T ordered by ≥ is called a lattice. □

It should be noted that usually in mathematics, a lattice is defined for a partial order
instead of a quasi-order. However, since in ILP we usually have to deal with individual
clauses rather than with equivalence classes of clauses, it is convenient for us to define
'lattice' for a quasi-order here. Anyhow, if a quasi-order ≥ is a lattice on T, then the partial
order induced by ≥ is a lattice on the set of equivalence classes in T.
In ILP, there are two main instantiations for the set of clauses T: either we take a clausal
language 𝒞, or we take a Horn language ℋ. Similarly, there are three interesting choices
for the quasi-order ≥: we can use either ⪰ (subsumption), ⊨ (implication), or ⊨_Σ (relative
implication) for some background knowledge Σ. In the ⪰-order, we will sometimes abbreviate
the terms 'least generalization of S under subsumption' and 'greatest specialization of
S under subsumption' to 'LGS of S' and 'GSS of S', respectively. Similarly, in the ⊨-order
we will sometimes speak of an LGI (least generalization under implication) and a GSI. In
the ⊨_Σ-order, we will use LGR (least generalization under relative implication) and GSR.

These two different languages and three different quasi-orders give a total of six com- 
binations. For each combination, we can ask whether an LG or GS of every finite set S 
exists. In the next section, we will review the answers for subsumption given by others or 
by ourselves. Then we devote two sections to least generalizations and greatest specializa- 
tions under implication, respectively. Finally, we discuss least generalizations and greatest 
specializations under relative implication. The results of this survey have already been 
summarized in Table 1 in the introduction. 

4. Subsumption 

First we devote some attention to subsumption. Least generalizations under subsumption 
have been discussed extensively by Plotkin (1970). The main result in Plotkin's framework 
is the following: 

Theorem 3 (Existence of LGS in 𝒞) Let 𝒞 be a clausal language. Then for every finite
S ⊆ 𝒞, there exists an LGS of S in 𝒞.

If S only contains Horn clauses, then it can be shown that the LGS of S is itself also a 
Horn clause. Thus the question for the existence of an LGS of every finite set S of clauses 
is answered positively for both clausal languages and for Horn languages. 

Plotkin established the existence of an LGS, but he seems to have ignored the GSS in 
(1970, 1971b), possibly because it is a very straightforward result. It is in fact fairly easy 
to show that the GSS of some finite set S of clauses is simply the union of all clauses in S 
after they are standardized apart. 8 We include the proof here. 

Theorem 4 (Existence of GSS in 𝒞) Let 𝒞 be a clausal language. Then for every finite
S ⊆ 𝒞, there exists a GSS of S in 𝒞.

Proof Suppose S = {D_1, ..., D_n} ⊆ 𝒞. Without loss of generality, we assume the clauses
in S are standardized apart. Let D = D_1 ∪ ... ∪ D_n, then D_i ⪰ D, for every 1 ≤ i ≤ n.
Now let C ∈ 𝒞 be such that D_i ⪰ C, for every 1 ≤ i ≤ n. Then for every 1 ≤ i ≤ n, there
is a θ_i such that D_iθ_i ⊆ C and θ_i only acts on variables in D_i. If we let θ = θ_1 ∪ ... ∪ θ_n,
then Dθ = D_1θ_1 ∪ ... ∪ D_nθ_n ⊆ C. Hence D ⪰ C, so D is a GSS of S in 𝒞. □
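
The construction in this proof amounts to renaming the clauses apart and taking their union. A minimal sketch follows, assuming the illustrative tuple encoding of terms and literals used in the other examples. Standardizing apart is done by appending the clause index to every variable name, which is a choice made for this example.

```python
def rename(term, suffix):
    """Append a suffix to every variable in a term."""
    if isinstance(term, str):
        return term + suffix
    return (term[0],) + tuple(rename(arg, suffix) for arg in term[1:])


def gss(clauses):
    """Union of the clauses after standardizing them apart."""
    union = set()
    for i, clause in enumerate(clauses, start=1):
        for sign, pred, args in clause:
            union.add((sign, pred, tuple(rename(t, f"_{i}") for t in args)))
    return union


a, b = ("a",), ("b",)
s = [[(True, "P", (a, "x"))],       # P(a, x)
     [(True, "P", ("y", b))]]       # P(y, b)
expected = {(True, "P", (a, "x_1")), (True, "P", ("y_2", b))}   # P(a, x_1) or P(y_2, b)
print(gss(s) == expected)           # True
```

No unification is involved: the two atoms are kept side by side as a disjunction rather than merged into P(a, b).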



8. Note that this has nothing to do with unification. For instance, if S = {P(a, x), P(y,b)}, then the GSS 
of S in 𝒞 would be P(a,x) ∨ P(y,b). However, if we were to instantiate T in Definition 11 to the set of
atoms, then the greatest specialization of two atoms in the set of atoms should itself also be an atom. 
The GSS of two atoms is then their most general unification (Reynolds, 1970). For instance, the GSS of 
S would in this case be P(a,b). 
This establishes that a clausal language 𝒞 ordered by ⪰ is a lattice.

Proving the existence of a GSS of every finite set of Horn clauses in ℋ requires a little
more work, but here also the result is positive. For example, D = P(a) <- P(f(a)), Q(y)
is a GSS of D_1 = P(x) <- P(f(x)) and D_2 = P(a) <- Q(y). Note that D can be obtained
by applying σ = {x/a} (the mgu of the heads of D_1 and D_2) to D_1 ∪ D_2, the GSS of D_1
and D_2 in 𝒞. This idea will be used in the following proof. Here we assume ℋ contains an
artificial bottom element (True) ⊥, such that C ⪰ ⊥ for every C ∈ ℋ, and ⊥ ⋡ C for every
C ≠ ⊥. Note that ⊥ is not subsume-equivalent with other tautologies.

Theorem 5 (Existence of GSS in ℋ) Let ℋ be a Horn language, with ⊥ ∈ ℋ. Then
for every finite S ⊆ ℋ, there exists a GSS of S in ℋ.

Proof Suppose S = {D_1, ..., D_n} ⊆ ℋ. Without loss of generality we assume the
clauses in S are standardized apart, D_1, ..., D_k are the definite program clauses in S,
and D_{k+1}, ..., D_n are the definite goals in S. If k = 0 (i.e., if S only contains goals), then
it is easy to show that D_1 ∪ ... ∪ D_n is a GSS of S in ℋ. If k ≥ 1 and the set {D_1^+, ..., D_k^+}
is not unifiable, then ⊥ is a GSS of S in ℋ. Otherwise, let σ be an mgu of {D_1^+, ..., D_k^+},
and let D = D_1σ ∪ ... ∪ D_nσ (note that actually D_iσ = D_i for k+1 ≤ i ≤ n, since the
clauses in S are standardized apart). Since D has exactly one literal in its head, it is a
definite program clause. Furthermore, we have D_i ⪰ D for every 1 ≤ i ≤ n, since D_iσ ⊆ D.

To show that D is a GSS of S in ℋ, suppose C ∈ ℋ is some clause such that D_i ⪰ C
for every 1 ≤ i ≤ n. For every 1 ≤ i ≤ n, let θ_i be such that D_iθ_i ⊆ C and θ_i only acts
on variables in D_i. Let θ = θ_1 ∪ ... ∪ θ_n. For every 1 ≤ i ≤ k, D_i^+θ = D_i^+θ_i = C^+, so θ is
a unifier of {D_1^+, ..., D_k^+}. But σ is an mgu of this set, so there is a γ such that θ = σγ.
Now Dγ = D_1σγ ∪ ... ∪ D_nσγ = D_1θ ∪ ... ∪ D_nθ = D_1θ_1 ∪ ... ∪ D_nθ_n ⊆ C. Hence D ⪰ C,
so D is a GSS of S in ℋ. See figure 1 for illustration of the case where n = 2.

Figure 1: D is a GSS of D_1 and D_2

□

Thus a Horn language ℋ ordered by ⪰ is also a lattice.
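
The construction used in the proof of Theorem 5 can be sketched as follows: rename the clauses apart, unify the heads of the definite program clauses, and apply the resulting mgu to the union. This is an illustration under stated assumptions rather than a reference implementation. Terms use the tuple encoding of the earlier sketches, every input is assumed to be a Horn clause, `None` stands in for the artificial bottom element, and all function names are hypothetical.

```python
BOTTOM = None   # stands for the artificial bottom element

def walk(t, s):
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s):
    """Return an mgu extending s (as a chain of bindings), or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def apply(t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(a, s) for a in t[1:])

def rename(t, suffix):
    if isinstance(t, str):
        return t + suffix
    return (t[0],) + tuple(rename(a, suffix) for a in t[1:])

def gss_horn(clauses):
    """GSS under subsumption of Horn clauses, following the proof's recipe."""
    renamed = [[(sign, p, tuple(rename(t, f"_{i}") for t in args))
                for sign, p, args in clause]
               for i, clause in enumerate(clauses, start=1)]
    heads = [lit for clause in renamed for lit in clause if lit[0]]
    sigma = {}
    for head in heads[1:]:
        first = heads[0]
        if first[1] != head[1] or len(first[2]) != len(head[2]):
            return BOTTOM                    # different predicates cannot unify
        sigma = unify(("#",) + first[2], ("#",) + head[2], sigma)
        if sigma is None:
            return BOTTOM                    # heads are not unifiable
    return {(sign, p, tuple(apply(t, sigma) for t in args))
            for clause in renamed for sign, p, args in clause}


a = ("a",)
d1 = [(True, "P", ("x",)), (False, "P", (("f", "x"),))]    # P(x) <- P(f(x))
d2 = [(True, "P", (a,)), (False, "Q", ("y",))]             # P(a) <- Q(y)
expected = {(True, "P", (a,)), (False, "P", (("f", a),)), (False, "Q", ("y_2",))}
print(gss_horn([d1, d2]) == expected)                      # True
```

Up to the renaming of y, the result is the clause D = P(a) <- P(f(a)), Q(y) from the example preceding the theorem.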

We end this section by briefly discussing Plotkin's (1971b) relative subsumption. This is
an extension of subsumption which takes background knowledge into account. This background
knowledge is rather restricted: it must be a finite set B of ground literals. Because
of its restrictiveness, we have not included relative subsumption in Table 1. Nevertheless,
we mention it here, because least generalization under relative subsumption forms the basis
of the well-known ILP system Golem (Muggleton & Feng, 1992).

Definition 13 Let C, D be clauses, B = {L_1, ..., L_m} be a finite set of ground literals.
Then C subsumes D relative to B, denoted by C ⪰_B D, if C ⪰ (D ∪ {¬L_1, ..., ¬L_m}). □

It is easy to see that ⪰_B is reflexive and transitive, so it imposes a quasi-order on a set of
clauses.

Suppose S = {D_1, ..., D_n} and B = {L_1, ..., L_m}. It is easy to see that an LGS of
{(D_1 ∪ {¬L_1, ..., ¬L_m}), ..., (D_n ∪ {¬L_1, ..., ¬L_m})} is a least generalization of S under
⪰_B, so every finite set of clauses has a least generalization under ⪰_B in 𝒞. Moreover, if
each D_i is a Horn clause and each L_j is a positive ground literal (i.e., a ground atom), then
this least generalization will itself also be a Horn clause. Accordingly, if B is a finite set
of positive ground literals, then every finite set of Horn clauses has a least generalization
under ⪰_B in ℋ.

5. Least Generalizations under Implication 

Now we turn from subsumption to the implication order. In this section we will discuss
LGI's, in the next section we handle GSI's. For Horn clauses, the LGI-question has already
been answered negatively by Muggleton and De Raedt (1994).

Let $D_1 = P(f^2(x)) \leftarrow P(x)$, $D_2 = P(f^3(x)) \leftarrow P(x)$, $C_1 = P(f(x)) \leftarrow P(x)$ and
$C_2 = P(f^2(y)) \leftarrow P(x)$. Then we have both $C_1 \models \{D_1, D_2\}$ and $C_2 \models \{D_1, D_2\}$. It is not
very difficult to see that there are no more specific Horn clauses than $C_1$ and $C_2$ that imply
both $D_1$ and $D_2$. For $C_1$: no resolvent of $C_1$ with itself implies $D_2$ and no clause that is
properly subsumed by $C_1$ still implies $D_1$ and $D_2$. For $C_2$: every resolvent of $C_2$ with itself
is a variant of $C_2$, and no clause that is properly subsumed by $C_2$ still implies $D_1$ and $D_2$.
Thus $C_1$ and $C_2$ are both "minimal" generalizations under implication of $\{D_1, D_2\}$. Since
$C_1$ and $C_2$ are not logically equivalent, there is no LGI of $\{D_1, D_2\}$ in $\mathcal{H}$.
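
As a quick check of the two implications claimed above, the following worked derivation (our own expansion; the primed variable $x'$ is merely a renamed copy introduced for the resolution steps) shows how $C_1$ yields $D_1$ and $D_2$ by resolution, while $C_2$ has both as instances:

```latex
\begin{align*}
&C_1 = P(f(x)) \leftarrow P(x), \qquad C_1' = P(f(x')) \leftarrow P(x') \\
&\text{resolving } C_1 \text{ with } C_1' \text{ under } \{x'/f(x)\}: \quad
   P(f^2(x)) \leftarrow P(x) \;=\; D_1 \\
&\text{resolving } C_1 \text{ with a renamed copy of } D_1 \text{ under } \{x'/f(x)\}: \quad
   P(f^3(x)) \leftarrow P(x) \;=\; D_2 \\
&C_2\{y/x\} \;=\; P(f^2(x)) \leftarrow P(x) \;=\; D_1 \\
&C_2\{y/f(x)\} \;=\; P(f^2(f(x))) \leftarrow P(x) \;=\; D_2
\end{align*}
```

So $C_1$ implies both clauses through derivations, whereas $C_2$ even subsumes them.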

However, the fact that there is no LGI of $\{D_1, D_2\}$ in $\mathcal{H}$ does not mean that $D_1$ and $D_2$
have no LGI in $\mathcal{C}$, since a Horn language is a more restricted space than a clausal language.
In fact, it is shown by Muggleton and Page (1994) that $C = P(f(x)) \vee P(f^2(y)) \leftarrow P(x)$ is
an LGI of $D_1$ and $D_2$ in $\mathcal{C}$. For this reason, it may be worthwhile for the LGI to consider
a clausal language instead of only Horn clauses.

In the next subsection, we show that any finite set of clauses which contains at least
one non-tautologous function-free clause has an LGI in $\mathcal{C}$. An immediate corollary of this
result is the existence of an LGI of any finite set of function-free clauses. In our usage of the 
word, a 'function-free' clause may contain constants, even though constants are sometimes 
seen as functions of arity 0. 

Definition 14 A clause is function-free if it does not contain function symbols of arity 1 
or more. □ 

Note that a clause is function-free iff it has depth 1. In case of sets of clauses which all 
contain function symbols, the LGI-question remains open. 

5.1 A Sufficient Condition for the Existence of an LGI 

In this subsection, we will show that any finite set $S$ of clauses containing at least one
non-tautologous function-free clause has an LGI in $\mathcal{C}$.

Definition 15 Let $C$ be a clause, $x_1, \ldots, x_n$ all distinct variables in $C$, and $K$ a set of
terms. Then the instance set of $C$ w.r.t. $K$ is $\mathcal{I}(C, K) = \{C\theta \mid \theta = \{x_1/t_1, \ldots, x_n/t_n\}$,
where $t_i \in K$, for every $1 \le i \le n\}$. If $\Sigma = \{C_1, \ldots, C_k\}$ is a set of clauses, then the
instance set of $\Sigma$ w.r.t. $K$ is $\mathcal{I}(\Sigma, K) = \mathcal{I}(C_1, K) \cup \ldots \cup \mathcal{I}(C_k, K)$. □

For example, if $C = P(x) \vee Q(y)$ and $T = \{a, f(z)\}$, then $\mathcal{I}(C, T) = \{(P(a) \vee Q(a)), (P(a) \vee
Q(f(z))), (P(f(z)) \vee Q(a)), (P(f(z)) \vee Q(f(z)))\}$.
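
The instance set can be enumerated mechanically. The following Python sketch is only an illustration and not part of the formal development: the tuple encoding of terms and literals and the names `substitute` and `instance_set` are our own assumptions, and the variables of the clause are passed explicitly because this encoding does not distinguish variables from constants.

```python
from itertools import product

# Assumed encoding: a variable or constant is a string; a compound term is a
# tuple such as ("f", "z") for f(z). A literal is (sign, predicate, args).

def substitute(term, theta):
    """Apply the substitution theta (dict: variable -> term) to a term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(t, theta) for t in term[1:])
    return theta.get(term, term)

def instance_set(clause, variables, terms):
    """All instances of `clause` that map every variable to a term in `terms`."""
    result = set()
    for choice in product(terms, repeat=len(variables)):
        theta = dict(zip(variables, choice))
        result.add(frozenset(
            (sign, pred, tuple(substitute(a, theta) for a in args))
            for sign, pred, args in clause))
    return result

# The example from the text: C = P(x) v Q(y), T = {a, f(z)}.
C = frozenset({(True, "P", ("x",)), (True, "Q", ("y",))})
T = ["a", ("f", "z")]
instances = instance_set(C, ["x", "y"], T)
```

With two variables and two terms, the loop visits the four substitutions that produce the four clauses listed above.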

Definition 16 Let $S$ be a finite set of clauses, and $\sigma$ be a Skolem substitution for $S$. Then
the term set of $S$ by $\sigma$ is the set of all terms (including subterms) occurring in $S\sigma$. □

A term set of $S$ by some $\sigma$ is a finite set of ground terms. For instance, the term set of
$D = P(f^2(x), y, z) \leftarrow P(y, z, f^2(x))$ by $\sigma = \{x/a, y/b, z/c\}$ is $T = \{a, f(a), f^2(a), b, c\}$.
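
As an illustration, the term set of an already Skolemized clause can be collected by walking over all subterms. This is a minimal sketch under our own assumptions (strings for constants, tuples such as `("f", "a")` for compound terms, literals as `(sign, predicate, args)`; the helper names are hypothetical):

```python
def subterms(term):
    """Yield a term together with all of its subterms."""
    yield term
    if isinstance(term, tuple):
        for arg in term[1:]:
            yield from subterms(arg)

def term_set(ground_clause):
    """All terms (including subterms) occurring in a ground clause."""
    return {t for _, _, args in ground_clause
              for a in args
              for t in subterms(a)}

# D sigma = P(f^2(a), b, c) <- P(b, c, f^2(a)), as in the example above.
f2a = ("f", ("f", "a"))
D_sigma = frozenset({(True, "P", (f2a, "b", "c")),
                     (False, "P", ("b", "c", f2a))})
T = term_set(D_sigma)
```

Here `T` contains exactly the five terms $a$, $f(a)$, $f^2(a)$, $b$ and $c$.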

Our definition of a term set corresponds to what Idestam-Almquist (1993, 1995) calls
a 'minimal term set'. In his definition, if $\sigma$ is a Skolem substitution for a set of clauses
$S = \{D_1, \ldots, D_n\}$ w.r.t. some other set of clauses $S'$, then a term set of $S$ is a finite set of
terms which contains the minimal term set of $S$ by $\sigma$ as a subset.

Using his notion of term set, he defines T-implication as follows: if $C$ and $D$ are clauses
and $T$ is a term set of $\{D\}$ by some Skolem substitution $\sigma$ w.r.t. $\{C\}$, then $C$ T-implies $D$
w.r.t. $T$ if $\mathcal{I}(C, T) \models D\sigma$. T-implication is decidable, weaker than logical implication and
stronger than subsumption. Idestam-Almquist (1993, 1995) gives the result that any finite
set of clauses has a least generalization under T-implication w.r.t. any term set $T$. However,
as he also notes, T-implication is not transitive and hence not a quasi-order. Therefore it
does not fit into our general framework here. For this reason, we will not discuss it fully
here, and for the same reason we have not included a row for T-implication in Table 1.

Let us now begin with the proof of our result concerning the existence of LGI's. Consider
$C = P(x, y, z) \leftarrow P(z, x, y)$ and $D$, $\sigma$ and $T$ as above. Then $C \models D$ and also $\mathcal{I}(C, T) \models D\sigma$,
since $D\sigma$ is a resolvent of $P(f^2(a), b, c) \leftarrow P(c, f^2(a), b)$ and $P(c, f^2(a), b) \leftarrow P(b, c, f^2(a))$,
which are in $\mathcal{I}(C, T)$. As we will show in the next lemma, this holds in general: if $C \models D$ and
$C$ is function-free, then we can restrict attention to the ground instances of $C$ instantiated
to terms in the term set of $D$ by some $\sigma$.

The proof of Lemma 3 uses the following idea. Consider a derivation of a clause $E$ from
a set $\Sigma$ of ground clauses. Suppose some of the clauses in $\Sigma$ contain terms not appearing in
$E$. Then any literals containing these terms in $\Sigma$ must be resolved away in the derivation.
This means that if we replace all the terms in the derivation that are not in $E$, by some
other term $t$, then the result will be another derivation of $E$. For example, the left of figure 2
shows a derivation of length 1 of $E$. The term $f^2(b)$ in the parent clauses does not appear
in $E$. If we replace this term by the constant $a$, the result is another derivation of $E$ (right
of the figure).

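Left derivation: from $P(b) \leftarrow P(f^2(b))$ and $P(f^2(b)) \leftarrow Q(a, f(a))$, derive $E = P(b) \leftarrow Q(a, f(a))$.
Right derivation: from $P(b) \leftarrow P(a)$ and $P(a) \leftarrow Q(a, f(a))$, derive $E = P(b) \leftarrow Q(a, f(a))$.

Figure 2: Transforming the left derivation yields the right derivation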
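
The transformation of figure 2 can be replayed on a small scale. The sketch below is our own illustration, not the authors' procedure: ground literals are encoded as `(sign, predicate, args)` tuples, and `replace_term`, `replace_in_clause` and `ground_resolvents` are hypothetical helper names.

```python
def replace_term(term, old, new):
    """Replace every occurrence of the term `old` inside `term` by `new`."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return (term[0],) + tuple(replace_term(t, old, new) for t in term[1:])
    return term

def replace_in_clause(clause, old, new):
    return frozenset((sign, pred, tuple(replace_term(a, old, new) for a in args))
                     for sign, pred, args in clause)

def ground_resolvents(c1, c2):
    """All binary resolvents of two ground clauses."""
    out = set()
    for sign, pred, args in c1:
        complement = (not sign, pred, args)
        if complement in c2:
            out.add((c1 - {(sign, pred, args)}) | (c2 - {complement}))
    return out

f2b, fa = ("f", ("f", "b")), ("f", "a")
left1 = frozenset({(True, "P", ("b",)), (False, "P", (f2b,))})      # P(b) <- P(f^2(b))
left2 = frozenset({(True, "P", (f2b,)), (False, "Q", ("a", fa))})   # P(f^2(b)) <- Q(a, f(a))
E = frozenset({(True, "P", ("b",)), (False, "Q", ("a", fa))})       # P(b) <- Q(a, f(a))

# Replace the term f^2(b), which does not occur in E, by the constant a.
right1 = replace_in_clause(left1, f2b, "a")
right2 = replace_in_clause(left2, f2b, "a")
```

Both `ground_resolvents(left1, left2)` and `ground_resolvents(right1, right2)` consist of the single clause `E`, since the replaced term only occurs in the literals that are resolved away.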

Lemma 3 Let $C$ be a function-free clause, $D$ be a clause, $\sigma$ be a Skolem substitution for
$D$ w.r.t. $\{C\}$ and $T$ be the term set of $D$ by $\sigma$. Then $C \models D$ iff $\mathcal{I}(C, T) \models D\sigma$.

Proof

$\Leftarrow$: Since $C \models \mathcal{I}(C, T)$ and $\mathcal{I}(C, T) \models D\sigma$, we have $C \models D\sigma$. Now $C \models D$ by Lemma 2.
$\Rightarrow$: If $D$ is a tautology, then $D\sigma$ is a tautology, so this case is obvious. Suppose $D$ is not
a tautology, then $D\sigma$ is not a tautology. Since $C \models D\sigma$, it follows from Theorem 1 that
there exists a finite set $\Sigma$ of ground instances of $C$, such that $\Sigma \models D\sigma$. By the Subsumption
Theorem, there exists a derivation from $\Sigma$ of a clause $E$, such that $E \succeq D\sigma$. Since $\Sigma$ is
ground, $E$ must also be ground, so we have $E \subseteq D\sigma$. This implies that $E$ only contains
terms from $T$.

Let $t$ be an arbitrary term in $T$ and let $\Sigma'$ be obtained from $\Sigma$ by replacing every term
in clauses in $\Sigma$ which is not in $T$, by $t$. Note that since each clause in $\Sigma$ is a ground instance
of the function-free clause $C$, every clause in $\Sigma'$ is also a ground instance of $C$. Now it is
easy to see that the same replacement of terms in the derivation of $E$ from $\Sigma$ results in
a derivation of $E$ from $\Sigma'$: (1) each resolution step in the derivation from $\Sigma$ can also be
carried out in the derivation from $\Sigma'$, since the same terms in $\Sigma$ are replaced by the same
terms in $\Sigma'$, and (2) the terms in $\Sigma$ that are not in $T$ (and hence are replaced by $t$) do not
appear in the conclusion $E$ of the derivation.

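Since there is a derivation of $E$ from $\Sigma'$ we have $\Sigma' \models E$, and hence $\Sigma' \models D\sigma$. $\Sigma'$ is a
set of ground instances of $C$ and all terms in $\Sigma'$ are terms in $T$, so $\Sigma' \subseteq \mathcal{I}(C, T)$. Hence
$\mathcal{I}(C, T) \models D\sigma$. □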

Lemma 3 cannot be generalized to the case where $C$ contains function symbols of arity
$\ge 1$: take $C = P(f(x), y) \leftarrow P(z, x)$ and $D = P(f(a), a) \leftarrow P(a, f(a))$ (from the example
given on p. 25 of Idestam-Almquist, 1993). Then $T = \{a, f(a)\}$ is the term set of $D$ and
we have $C \models D$, yet it can be seen that $\mathcal{I}(C, T) \not\models D$. The argument used in the previous
lemma does not work here, because different terms in some ground instance need not relate
to different variables. For example, in the ground instance $P(f^2(a), a) \leftarrow P(a, f(a))$ of $C$,
we cannot just replace $f^2(a)$ by some other term, for then the resulting clause would not
be an instance of $C$.

On the other hand, Lemma 3 can be generalized to a set of clauses instead of a single
clause. If $\Sigma$ is a set of function-free clauses, $C$ is an arbitrary clause, $\sigma$ is a Skolem
substitution for $C$ w.r.t. $\Sigma$, and $T$ is the term set of $C$ by $\sigma$, then we have that
$\Sigma \models C$ iff $\mathcal{I}(\Sigma, T) \models C\sigma$. The proof is almost literally the same as above.

This result implies that $\Sigma \models C$ is reducible to an implication $\mathcal{I}(\Sigma, T) \models C\sigma$ between
ground clauses. Since, by the next lemma, implication between ground clauses is decidable,
it follows that $\Sigma \models C$ is decidable in case $\Sigma$ is function-free.

Lemma 4 The problem whether $\Sigma \models C$, where $\Sigma$ is a finite set of ground clauses and $C$ is
a ground clause, is decidable.

Proof Let $C = L_1 \vee \ldots \vee L_n$ and $A$ be the set of all ground atoms occurring in $\Sigma$ and $C$.
Now consider the following statements, which can be shown equivalent.

(1) $\Sigma \models C$.

(2) $\Sigma \cup \{\neg L_1, \ldots, \neg L_n\}$ is unsatisfiable.

(3) $\Sigma \cup \{\neg L_1, \ldots, \neg L_n\}$ has no Herbrand model.

(4) No subset of $A$ is an Herbrand model of $\Sigma \cup \{\neg L_1, \ldots, \neg L_n\}$.

Then (1)$\Leftrightarrow$(2). (2)$\Leftrightarrow$(3) by Theorem 4.2 of (Chang & Lee, 1973). Since also (3)$\Leftrightarrow$(4), we
have (1)$\Leftrightarrow$(4). (4) is decidable because $A$ is finite, so (1) is decidable as well. □
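
Statement (4) suggests a direct, if exponential, decision procedure: try every subset of $A$ as a Herbrand interpretation. The following Python sketch illustrates this under our own assumptions (ground literals encoded as `(sign, predicate, args)` tuples; `atoms_of`, `satisfies` and `ground_implies` are hypothetical names). It is meant only to make the argument concrete, not as an efficient implementation.

```python
from itertools import combinations

def atoms_of(clauses):
    """The set A of ground atoms occurring in the given clauses."""
    return {(pred, args) for clause in clauses for _, pred, args in clause}

def satisfies(true_atoms, clause):
    """A ground clause is true if at least one of its literals is true."""
    return any(((pred, args) in true_atoms) == sign
               for sign, pred, args in clause)

def ground_implies(sigma, clause):
    """Decide sigma |= clause by checking every subset of A for a countermodel."""
    atoms = list(atoms_of(list(sigma) + [clause]))
    for size in range(len(atoms) + 1):
        for subset in combinations(atoms, size):
            interpretation = set(subset)
            if (all(satisfies(interpretation, c) for c in sigma)
                    and not satisfies(interpretation, clause)):
                return False  # model of sigma that falsifies the clause
    return True

# Sigma = { P(b) <- P(a),  P(a) <- Q(b) }
sigma = [frozenset({(True, "P", ("b",)), (False, "P", ("a",))}),
         frozenset({(True, "P", ("a",)), (False, "Q", ("b",))})]
implied = frozenset({(True, "P", ("b",)), (False, "Q", ("b",))})   # P(b) <- Q(b)
not_implied = frozenset({(True, "P", ("a",))})                     # P(a)
```

In this small example `ground_implies(sigma, implied)` holds, while `ground_implies(sigma, not_implied)` fails because the empty interpretation is a model of `sigma` that falsifies $P(a)$. The loop inspects up to $2^{|A|}$ interpretations, which is finite exactly because $A$ is finite.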

Corollary 1 The problem whether $\Sigma \models C$, where $\Sigma$ is a finite set of function-free clauses
and $C$ is a clause, is decidable.

The following sequence of lemmas more or less follows the pattern of Idestam-Almquist's 
(1995) Lemma 10 to Lemma 12 (similar to Lemma 3.10 to Lemma 3.12 of Idestam-Almquist, 
1993). There he gives a proof of the existence of a least generalization under T-implication 
of any finite set of (not necessarily function-free) clauses. We can adjust the proof in such 
a way that we can use it to establish the existence of an LGI of any finite set of clauses 
containing at least one non-tautologous function-free clause. 

Lemma 5 Let $S$ be a finite set of non-tautologous clauses, $V = \{x_1, \ldots, x_m\}$ be a set of
variables and let $G = \{C_1, C_2, \ldots\}$ be a (possibly infinite) set of generalizations of $S$ under
implication. Then the set $G' = \mathcal{I}(C_1, V) \cup \mathcal{I}(C_2, V) \cup \ldots$ is a finite set of clauses.

Proof Let d be the maximal depth of the terms in clauses in S. It follows from Lemma 1 
that G (and hence also G') cannot contain terms of depth greater than d, nor predicates, 
functions or constants other than those in S. The set of literals which can be constructed 
from predicates in S and from terms of depth at most d consisting of functions and constants 
in S and variables in V, is finite. Hence the set of clauses which can be constructed from 
those literals is finite as well. G' is a subset of this set, so G' is a finite set of clauses. □ 



Lemma 6 Let $D$ be a clause, $C$ be a function-free clause such that $C \models D$, $T = \{t_1, \ldots, t_n\}$
be the term set of $D$ by $\sigma$, $V = \{x_1, \ldots, x_m\}$ be a set of variables and $m \ge n$. If $E$ is an
LGS of $\mathcal{I}(C, V)$, then $E \models D$.

Proof Let $\gamma = \{x_1/t_1, \ldots, x_n/t_n, x_{n+1}/t_n, \ldots, x_m/t_n\}$ (it does not matter to which terms
the variables $x_{n+1}, \ldots, x_m$ are mapped by $\gamma$, as long as they are mapped to terms in $T$).
Suppose $\mathcal{I}(C, V) = \{C\rho_1, \ldots, C\rho_k\}$. Then $\mathcal{I}(C, T) = \{C\rho_1\gamma, \ldots, C\rho_k\gamma\}$. Let $E$ be an LGS
of $\mathcal{I}(C, V)$ (note that $E$ must be function-free). Then for every $1 \le i \le k$, there are $\theta_i$
such that $E\theta_i \subseteq C\rho_i$. This means that $E\theta_i\gamma \subseteq C\rho_i\gamma$ and hence $E\theta_i\gamma \models C\rho_i\gamma$, for every
$1 \le i \le k$. Therefore $E \models \mathcal{I}(C, T)$.

Since $C \models D$, we know from Lemma 1 that constants appearing in $C$ must also appear
in $D$. This means that $\sigma$ is a Skolem substitution for $D$ w.r.t. $\{C\}$. Then from Lemma 3
we know $\mathcal{I}(C, T) \models D\sigma$, hence $E \models D\sigma$. Furthermore, since $E$ is an LGS of $\mathcal{I}(C, V)$, all
constants in $E$ also appear in $C$, hence all constants in $E$ must appear in $D$. Thus $\sigma$ is also
a Skolem substitution for $D$ w.r.t. $\{E\}$. Therefore $E \models D$ by Lemma 2. □

Consider $C = P(x, y, z) \leftarrow P(y, z, x)$ and $D = \; \leftarrow Q(w)$. Both $C$ and $D$ imply the clause
$E = P(x, y, z) \leftarrow P(z, x, y), Q(b)$. Now note that $C \cup D = P(x, y, z) \leftarrow P(y, z, x), Q(w)$ also
implies $E$. This holds for clauses in general, even in the presence of background knowledge
$\Sigma$. The next lemma is very general, but in this section we only need the special case where
$C$ and $D$ are function-free and $\Sigma$ is empty. We need the general case to prove the existence
of a GSR in Section 8.

Lemma 7 Let $C$, $D$ and $E$ be clauses such that $C$ and $D$ are standardized apart and let $\Sigma$
be a set of clauses. If $C \models_\Sigma E$ and $D \models_\Sigma E$, then $C \cup D \models_\Sigma E$.

Proof Suppose $C \models_\Sigma E$ and $D \models_\Sigma E$, and let $M$ be a model of $\Sigma \cup \{C \cup D\}$. Since $C$
and $D$ are standardized apart, the clause $C \cup D$ is equivalent to the formula $\forall(C) \vee \forall(D)$
(where $\forall(C)$ denotes the universally quantified clause $C$). This means that $M$ is a model of
$C$ or a model of $D$. Furthermore, $M$ is also a model of $\Sigma$, so it follows from $\Sigma \cup \{C\} \models E$
or $\Sigma \cup \{D\} \models E$ that $M$ is a model of $E$. Thus $\Sigma \cup \{C \cup D\} \models E$, hence $C \cup D \models_\Sigma E$. □

Now we can prove the existence of an LGI of any finite set S of clauses which contains at 
least one non-tautologous and function-free clause. In fact we can prove something stronger, 
namely that this LGI is a special LGI. This is an LGI that is not only implied, but actually 
subsumed by any other generalization of S: 

Definition 17 Let $\mathcal{C}$ be a clausal language and $S$ be a finite subset of $\mathcal{C}$. An LGI $C$ of $S$
in $\mathcal{C}$ is called a special LGI of $S$ in $\mathcal{C}$, if $C' \succeq C$ for every generalization $C' \in \mathcal{C}$ of $S$ under
implication. □

Note that if $D$ is an LGI of a set containing at least one non-tautologous function-free
clause, then by Lemma 1 $D$ is itself function-free, because it should imply the
function-free clause(s) in $S$. For instance, $C = P(x, y, z) \leftarrow P(y, z, x), Q(w)$ is an LGI of $D_1 =
P(x, y, z) \leftarrow P(y, z, x), Q(f(a))$ and $D_2 = P(x, y, z) \leftarrow P(z, x, y), Q(b)$. Note that this LGI
is properly subsumed by the LGS of $\{D_1, D_2\}$, which is $P(x, y, z) \leftarrow P(x', y', z'), Q(w)$. An
LGI may sometimes be the empty clause □, for example if $S = \{P(a), Q(a)\}$.

Theorem 6 (Existence of special LGI in C) Let C be a clausal language. If S is a 
finite set of clauses from C and S contains at least one non-tautologous function-free clause, 
then there exists a special LGI of S in C. 

Proof Let $S = \{D_1, \ldots, D_n\}$ be a finite set of clauses from $\mathcal{C}$, such that $S$ contains at least
one non-tautologous function-free clause. We can assume without loss of generality that $S$
contains no tautologies. Let $\sigma$ be a Skolem substitution for $S$, $T = \{t_1, \ldots, t_m\}$ be the term
set of $S$ by $\sigma$, $V = \{x_1, \ldots, x_m\}$ be a set of variables and $G = \{C_1, C_2, \ldots\}$ be the set of
all generalizations of $S$ under implication in $\mathcal{C}$. Note that □ $\in G$, so $G$ is not empty. Since
each clause in $G$ must imply the function-free clause(s) in $S$, it follows from Lemma 1 that
all members of $G$ are function-free. By Lemma 5, the set $G' = \mathcal{I}(C_1, V) \cup \mathcal{I}(C_2, V) \cup \ldots$ is
a finite set of clauses. Since $G'$ is finite, the set of $\mathcal{I}(C_i, V)$s is also finite. For simplicity, let
$\{\mathcal{I}(C_1, V), \ldots, \mathcal{I}(C_k, V)\}$ be the set of all distinct $\mathcal{I}(C_i, V)$s.

Let $E_i$ be an LGS of $\mathcal{I}(C_i, V)$, for every $1 \le i \le k$, such that $E_1, \ldots, E_k$ are standardized
apart. For every $1 \le j \le n$, the term set of $D_j$ by $\sigma$ is some set $\{t_{j_1}, \ldots, t_{j_s}\} \subseteq T$, such
that $m \ge j_s$. From Lemma 6, we have that $E_i \models D_j$, for every $1 \le i \le k$ and $1 \le j \le n$,
hence $E_i \models S$. Now let $F = E_1 \cup \ldots \cup E_k$, then we have $F \models S$ from Lemma 7 (applying
the case of Lemma 7 where $\Sigma$ is empty).

To prove that $F$ is a special LGI of $S$, it remains to show that $C_j \succeq F$, for every $j \ge 1$.
For every $j \ge 1$, there is an $i$ ($1 \le i \le k$), such that $\mathcal{I}(C_j, V) = \mathcal{I}(C_i, V)$. So for this $i$,
$E_i$ is an LGS of $\mathcal{I}(C_j, V)$. $C_j$ is itself also a generalization of $\mathcal{I}(C_j, V)$ under subsumption,
hence $C_j \succeq E_i$. Then finally $C_j \succeq F$, since $E_i \subseteq F$. □

As a consequence, we also immediately have the following: 

Corollary 2 (Existence of LGI for function-free clauses) Let $\mathcal{C}$ be a clausal language.
Then for every finite set of function-free clauses $S \subseteq \mathcal{C}$, there exists an LGI of $S$ in $\mathcal{C}$.

Proof Let S be a finite set of function-free clauses in C. If S only contains tautologies, 
any tautology will be an LGI of S. Otherwise, let S' be obtained by deleting all tautologies 
from S. By the previous theorem, there is a special LGI of S'. Clearly, this is also a special 
LGI of S itself in C. □ 

This corollary is not trivial, since even though the number of Herbrand interpretations
of a language without function symbols is finite (due to the fact that the number of all
possible ground atoms is finite in this case), $S$ may nevertheless be implied by an infinite
number of non-equivalent clauses. This may seem like a paradox, since there are only
finitely many categories of clauses that can "behave differently" in a finite number of finite
Herbrand interpretations. Thus it would seem that the number of non-equivalent
function-free clauses should also be finite. This is a misunderstanding, since logical implication (and
hence also logical equivalence) is defined in terms of all interpretations, not just Herbrand
interpretations. For instance, define $D_1 = P(a, a)$ and $D_2 = P(b, b)$, $C_n = \{P(x_i, x_j) \mid i \ne j, 1 \le
i, j \le n\}$. Then we have $C_n \models \{D_1, D_2\}$, $C_n \models C_{n+1}$ and $C_{n+1} \not\models C_n$, for every $n \ge 1$, see
(van der Laag & Nienhuys-Cheng, 1994).

Another interesting consequence of Theorem 6 concerns self-saturation (see the intro- 
duction to this paper for the definition of self-saturation). If C is a special LGI of some set 
S, then it is clear that C is self-saturated: any clause which implies C also implies S and 
hence must subsume C, since C is a special LGI of S. Now consider S = {D}, where D 
is some non-tautologous function-free clause. Then a special LGI C of S will be logically 
equivalent to D. Moreover, since this C will be self-saturated, it is a self-saturation of D. 

Corollary 3 If D is a non-tautologous function-free clause, then there exists a self-satura- 
tion of D . 

5.2 The LGI is Computable 

In the previous subsection we proved the existence of an LGI in $\mathcal{C}$ of every finite set $S$ of
clauses containing at least one non-tautologous function-free clause. In this subsection we
will establish the computability of such an LGI. The next algorithm, extracted from the
proof of the previous subsection, computes this LGI of $S$.

LGI-Algorithm

Input: A finite set S of clauses, containing at least one non-tautologous function- 
free clause. 

Output: An LGI of S in C. 

1. Remove all tautologies from $S$ (a clause is a tautology iff it contains literals
$A$ and $\neg A$), call the remaining set $S'$.

2. Let $m$ be the number of distinct terms (including subterms) in $S'$, let
$V = \{x_1, \ldots, x_m\}$. (Notice that this $m$ is the same number as the number
of terms in the term set $T$ used in the proof of Theorem 6.)

3. Let $G$ be the (finite) set of all clauses which can be constructed from
predicates and constants in $S'$ and variables in $V$.

4. Let $\{U_1, \ldots, U_n\}$ be the set of all subsets of $G$.

5. Let $H_i$ be an LGS of $U_i$, for every $1 \le i \le n$. These $H_i$ can be computed
by Plotkin's (1970) algorithm.

6. Remove from $\{H_1, \ldots, H_n\}$ all clauses which do not imply $S'$ (since each
$H_i$ is function-free, by Corollary 1 this implication is decidable), and
standardize the remaining clauses $\{H_1, \ldots, H_q\}$ apart.

7. Return the clause $H = H_1 \cup \ldots \cup H_q$.

The correctness of this algorithm follows from the proof of Theorem 6. First notice that
$H \models S$ by Lemma 7. Furthermore, note that all $\mathcal{I}(C_i, V)$s mentioned in the proof of
Theorem 6 are elements of the set $\{U_1, \ldots, U_n\}$. This means that for every $E_i$ in the set
$\{E_1, \ldots, E_k\}$ mentioned in that proof, there is a clause $H_j$ in $\{H_1, \ldots, H_q\}$ such that $E_i$
and $H_j$ are subsume-equivalent. Then it follows that the LGI $F = E_1 \cup \ldots \cup E_k$ of that
proof subsumes the clause $H = H_1 \cup \ldots \cup H_q$ that our algorithm returns. On the other
hand, $F$ is a special LGI, so $F$ and $H$ must be subsume-equivalent.

Suppose the number of distinct constants in $S'$ is $c$ and the number of distinct variables in
step 2 of the algorithm is $m$. Furthermore, suppose there are $p$ distinct predicate symbols in
$S'$, with respective arities $a_1, \ldots, a_p$. Then the number of distinct atoms that can be formed
from these constants, variables and predicates, is $l = \sum_{i=1}^{p} (c + m)^{a_i}$, and the number of
distinct literals that can be formed is $2 \cdot l$. The set $G$ of distinct clauses which can be
formed from these literals is the power set of this set of literals, so $|G| = 2^{2l}$. Then the set
$\{U_1, \ldots, U_n\}$ of all subsets of $G$ contains $2^{|G|} = 2^{2^{2l}}$ members.
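
To get a feeling for these numbers, the count can be evaluated for small inputs. The sketch below simply transcribes the formulas above; the function name `search_space_sizes` and the choice to report the number of subsets only through its base-2 logarithm (which equals $|G|$) are our own.

```python
def search_space_sizes(c, m, arities):
    """Return (l, number of literals, |G|) for c constants, m variables and
    predicates with the given arities; the number of subsets of G is 2 ** |G|."""
    num_atoms = sum((c + m) ** a for a in arities)   # l
    num_literals = 2 * num_atoms                     # 2 * l
    num_clauses = 2 ** num_literals                  # |G| = 2^(2l)
    return num_atoms, num_literals, num_clauses

# One constant, one variable, a single unary predicate.
tiny = search_space_sizes(c=1, m=1, arities=[1])

# Two constants, three variables, a single binary predicate.
small = search_space_sizes(c=2, m=3, arities=[2])
```

Already for the tiny case there are $2^{16} = 65536$ subsets $U_i$; for the second case $|G| = 2^{50}$, so the number of subsets, $2^{2^{50}}$, is far beyond anything that could be enumerated.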

Thus the algorithm outlined above is not very efficient (to say the least). A more efficient 
algorithm may exist, but since implication is harder than subsumption and the computation 
of an LGS is already quite expensive, we should not put our hopes too high. Nevertheless, 
the existence of the LGI-algorithm does establish the theoretical point that the LGI of any 
finite set of clauses containing at least one non-tautologous function-free clause is effectively 
computable. 

Theorem 7 (Computability of LGI) Let $\mathcal{C}$ be a clausal language. If $S$ is a finite set of
clauses from $\mathcal{C}$, and $S$ contains at least one non-tautologous function-free clause, then the
LGI of $S$ in $\mathcal{C}$ is computable.

6. Greatest Specializations under Implication

Now we turn from least generalizations under implication to greatest specializations.
Finding least generalizations of sets of clauses is common practice in ILP. On the other hand,
the greatest specialization, which is the dual of the least generalization, is hardly ever used.
Nevertheless, the GSI of two clauses $D_1$ and $D_2$ might be useful. Suppose that we have one
positive example $e^+$ and two negative examples $e_1^-$ and $e_2^-$, and suppose that $D_1$ implies $e^+$
and $e_1^-$, while $D_2$ implies $e^+$ and $e_2^-$. Then it might very well be that the GSI of $D_1$ and
$D_2$ still implies $e^+$, but does not imply either $e_1^-$ or $e_2^-$. Thus we could obtain a correct
specialization by taking the GSI of $D_1$ and $D_2$.

It is obvious from the previous sections that the existence of an LGI of S is quite hard 
to establish. For clauses which all contain functions, the existence of an LGI is still an open 
question, and even for the case where S contains at least one non-tautologous function-free 
clause, the proof was far from trivial. However, the existence of a GSI in C is much easier 
to prove. In fact, a GSI of a finite set S is the same as the GSS of S, namely the union of 
the clauses in S after these are standardized apart. 

To see the reason for this dissymmetry, let us take a step back from the clausal framework
and consider full first-order logic for a moment. If $\phi_1$ and $\phi_2$ are two arbitrary first-order
formulas, then it can be easily shown that their least generalization is just $\phi_1 \wedge \phi_2$: this
conjunction implies $\phi_1$ and $\phi_2$, and must be implied by any other formula which implies
both $\phi_1$ and $\phi_2$. Dually, the greatest specialization is just $\phi_1 \vee \phi_2$: this is implied by both $\phi_1$
and $\phi_2$, and must imply any other formula that is implied by both $\phi_1$ and $\phi_2$. See figure 3.



Figure 3: Least generalization and greatest specialization in first-order logic 

Now suppose $\phi_1$ and $\phi_2$ are clauses. Then why do we have a problem in finding the LGI
of $\phi_1$ and $\phi_2$? The reason for this is that $\phi_1 \wedge \phi_2$ is not a clause. Instead of using $\phi_1 \wedge \phi_2$, we
have to find some least clause which implies both clauses $\phi_1$ and $\phi_2$. Such a clause appears
quite hard to find sometimes.

On the other hand, in case of specialization there is no problem. Here we can take
$\phi_1 \vee \phi_2$ as GSI, since $\phi_1 \vee \phi_2$ is equivalent to a clause, if we handle the universal quantifiers
in front of a clause properly. If $\phi_1$ and $\phi_2$ are standardized apart, then the formula $\phi_1 \vee \phi_2$
is equivalent to the clause which is the union of $\phi_1$ and $\phi_2$. This fact was used in the proof
of Lemma 7.

Suppose $S = \{D_1, \ldots, D_n\}$, and $D'_1, \ldots, D'_n$ are variants of these clauses which are
standardized apart. Then clearly $D = D'_1 \cup \ldots \cup D'_n$ is a GSI of $S$, since it follows from
Lemma 7 that any specialization of $S$ under implication is implied by $D$. Thus we have the
following result:



Theorem 8 (Existence of GSI in $\mathcal{C}$) Let $\mathcal{C}$ be a clausal language. Then for every finite
$S \subseteq \mathcal{C}$, there exists a GSI of $S$ in $\mathcal{C}$.
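
The construction behind Theorem 8 is simple enough to sketch in a few lines. The Python code below is an illustration under our own assumptions: literals are `(sign, predicate, args)` tuples, terms are strings or nested tuples, the set of variable names is supplied explicitly, and clauses are standardized apart by appending a hypothetical suffix `_i` to the variables of the $i$-th clause (assuming no existing symbol already uses such a name).

```python
def rename(term, suffix, variables):
    """Append `suffix` to every variable occurring in `term`."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(rename(t, suffix, variables) for t in term[1:])
    return term + suffix if term in variables else term

def union_standardized_apart(clauses, variables):
    """Union of the clauses after renaming their variables apart."""
    union = set()
    for i, clause in enumerate(clauses, start=1):
        suffix = f"_{i}"
        for sign, pred, args in clause:
            union.add((sign, pred,
                       tuple(rename(a, suffix, variables) for a in args)))
    return frozenset(union)

# C = P(x, y, z) <- P(y, z, x)   and   D = <- Q(w)
C = frozenset({(True, "P", ("x", "y", "z")), (False, "P", ("y", "z", "x"))})
D = frozenset({(False, "Q", ("w",))})
gsi = union_standardized_apart([C, D], variables={"x", "y", "z", "w"})
```

The result is a variant of $P(x, y, z) \leftarrow P(y, z, x), Q(w)$, the union used as an example before Lemma 7.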

The previous theorem holds for clauses in general, so in particular also for function-free 
clauses. Furthermore, Corollary 2 guarantees us that in a function-free clausal language an 
LGI of every finite S exists. This means that the set of function-free clauses quasi-ordered 
by logical implication is in fact a lattice. 

Corollary 4 (Lattice-structure of function-free clauses under $\models$) A function-free
clausal language ordered by implication is a lattice.

In case of a Horn language $\mathcal{H}$, we cannot apply the same proof method as in the case of a
clausal language, since the union of two Horn clauses need not be a Horn clause itself. In
fact, we can show that not every finite set of Horn clauses has a GSI in $\mathcal{H}$. Here we can use
the same clauses that we used to show that sets of Horn clauses need not have an LGI in
$\mathcal{H}$, this time from the perspective of specialization instead of generalization.

Again, let $D_1 = P(f^2(x)) \leftarrow P(x)$, $D_2 = P(f^3(x)) \leftarrow P(x)$, $C_1 = P(f(x)) \leftarrow P(x)$ and
$C_2 = P(f^2(y)) \leftarrow P(x)$. Then $C_1 \models \{D_1, D_2\}$ and $C_2 \models \{D_1, D_2\}$, and there is no Horn
clause $D$ such that $D \models D_1$, $D \models D_2$, $C_1 \models D$ and $C_2 \models D$. Hence there is no GSI of
$\{C_1, C_2\}$ in $\mathcal{H}$.

7. Least Generalizations under Relative Implication 

Implication is stronger than subsumption, but relative implication is even more powerful,
because background knowledge can be used to model all sorts of useful properties and
relations. In this section, we will discuss least generalizations under implication relative
to some given background knowledge $\Sigma$ (LGR's). In the next section we treat greatest
specializations under relative implication.

First, we will prove the equivalence between our definition of relative implication and a
definition given by Niblett (1988, p. 133). He gives the following definition of subsumption
relative to a background knowledge $\Sigma$ (to distinguish it from our notion of subsumption,
we will call this 'N-subsumption'): 9

Definition 18 Clause $C$ N-subsumes clause $D$ with respect to background knowledge $\Sigma$ if
there is a substitution $\theta$ such that $\Sigma \vdash (C\theta \rightarrow D)$ (here '$\rightarrow$' is the implication-connective,
and '$\vdash$' is an arbitrary complete proof procedure). □

Proposition 1 Let $C$ and $D$ be clauses and $\Sigma$ be a set of clauses. Then $C$ N-subsumes $D$
with respect to $\Sigma$ iff $C \models_\Sigma D$.

Proof Consider the following six statements, which can be shown equivalent.

(1) $C$ N-subsumes $D$ with respect to $\Sigma$.

(2) There is a substitution $\theta$ such that $\Sigma \vdash (C\theta \rightarrow D)$.

(3) There is a substitution $\theta$ such that $\Sigma \models (C\theta \rightarrow D)$.

(4) There is a substitution $\theta$ such that $\Sigma \cup \{C\theta\} \models D$.

(5) $\Sigma \cup \{C\} \models D$.

(6) $C \models_\Sigma D$.

(1)$\Leftrightarrow$(2) by definition. (2)$\Leftrightarrow$(3) by the completeness of $\vdash$. (3)$\Leftrightarrow$(4) by the Deduction
Theorem. (4)$\Rightarrow$(5) is obvious and (5)$\Rightarrow$(4) follows from letting $\theta$ be the empty substitution,
hence (4)$\Leftrightarrow$(5). Finally, (5)$\Leftrightarrow$(6) by definition. Thus these six statements are equivalent. □

9. Niblett attributes this definition to Plotkin, though Plotkin gives a rather different definition of relative
subsumption in (Plotkin, 1971b), as we have seen in Section 4.

Since $\models$ is the special case of $\models_\Sigma$ where $\Sigma$ is empty, our counterexamples to the existence
of LGI's or GSI's in $\mathcal{H}$ are also counterexamples to the existence of LGR's or GSR's in $\mathcal{H}$.
In other words, the '$-$'-entries in the second row of Table 1 carry over to the third row.

For general clauses, the LGR-question also has a negative answer. We will show here
that even if $S$ and $\Sigma$ are both finite sets of function-free clauses, an LGR of $S$ relative to $\Sigma$
need not exist. Let $D_1 = P(a)$, $D_2 = P(b)$, $S = \{D_1, D_2\}$, and $\Sigma = \{(P(a) \vee \neg Q(x)), (P(b) \vee
\neg Q(x))\}$. We will show that this $S$ has no LGR relative to $\Sigma$ in $\mathcal{C}$.

Suppose $C$ is an LGR of $S$ relative to $\Sigma$. Note that if $C$ contains the literal $P(a)$,
then the Herbrand interpretation which makes $P(a)$ true and which makes all other ground
literals false, would be a model of $\Sigma \cup \{C\}$ but not of $D_2$, so then we would have $C \not\models_\Sigma D_2$.
Similarly, if $C$ contains $P(b)$ then $C \not\models_\Sigma D_1$. Hence $C$ cannot contain $P(a)$ or $P(b)$.

Now let $d$ be a constant not appearing in $C$. Let $D = P(x) \vee Q(d)$, then $D \models_\Sigma
S$. By the definition of an LGR, we should have $D \models_\Sigma C$. Then by the subsumption
theorem, there must be a derivation from $\Sigma \cup \{D\}$ of a clause $E$, which subsumes $C$. The
set of all clauses which can be derived (in 0 or more resolution-steps) from $\Sigma \cup \{D\}$ is
$\Sigma \cup \{D\} \cup \{(P(a) \vee P(x)), (P(b) \vee P(x))\}$. But none of these clauses subsumes $C$, because $C$
does not contain the constant $d$, nor the literals $P(a)$ or $P(b)$. Hence $D \not\models_\Sigma C$, contradicting
the assumption that $C$ is an LGR of $S$ relative to $\Sigma$ in $\mathcal{C}$.

Thus in general the LGR of $S$ relative to $\Sigma$ need not exist. However, we can identify a
special case in which the LGR does exist. This case might be of practical interest. Suppose
$\Sigma = \{L_1, \ldots, L_m\}$ is a finite set of function-free ground literals. We can assume $\Sigma$ does
not contain complementary literals (i.e., $A$ and $\neg A$), for otherwise $\Sigma$ would be inconsistent.
Also, suppose $S = \{D_1, \ldots, D_n\}$ is a set of clauses, at least one of which is non-tautologous
and function-free. Then $C \models_\Sigma D_i$ iff $\{C\} \cup \Sigma \models D_i$ iff $C \models D_i \vee \neg(L_1 \wedge \ldots \wedge L_m)$
iff $C \models D_i \vee \neg L_1 \vee \ldots \vee \neg L_m$. This means that an LGI of the set of clauses $\{(D_1 \vee
\neg L_1 \vee \ldots \vee \neg L_m), \ldots, (D_n \vee \neg L_1 \vee \ldots \vee \neg L_m)\}$ is also an LGR of $S$ relative to $\Sigma$. If some
$D_k \vee \neg L_1 \vee \ldots \vee \neg L_m$ is non-tautologous and function-free, this LGI exists and is computable.
Hence in this special case, the LGR of $S$ relative to $\Sigma$ exists and is computable.

8. Greatest Specializations under Relative Implication 

Since the counterexample to the existence of GSI's in $\mathcal{H}$ is also a counterexample to the
existence of GSR's in $\mathcal{H}$, the only remaining question in the $\models_\Sigma$-order is the existence of
GSR's in $\mathcal{C}$. The answer to this question is positive. In fact, like the GSS and the GSI, the
GSR of some finite set $S$ in $\mathcal{C}$ is just the union of the (standardized apart) clauses in $S$.

Theorem 9 (Existence of GSR in $\mathcal{C}$) Let $\mathcal{C}$ be a clausal language and $\Sigma \subseteq \mathcal{C}$. Then for
every finite $S \subseteq \mathcal{C}$, there exists a GSR of $S$ relative to $\Sigma$ in $\mathcal{C}$.

Proof Suppose $S = \{D_1, \ldots, D_n\} \subseteq \mathcal{C}$. Without loss of generality, we assume the clauses
in $S$ are standardized apart. Let $D = D_1 \cup \ldots \cup D_n$, then $D_i \models_\Sigma D$, for every $1 \le i \le n$.
Now let $C \in \mathcal{C}$ be such that $D_i \models_\Sigma C$, for every $1 \le i \le n$. Then from Lemma 7, we have
$D \models_\Sigma C$. Hence $D$ is a GSR of $S$ relative to $\Sigma$ in $\mathcal{C}$. □



9. Conclusion 

In ILP, the three main generality orders are subsumption, implication, and relative impli- 
cation. The two main languages are clausal languages and Horn languages. This gives a 
total of six different ordered sets. In this paper, we have given a systematic treatment of 
the existence or non-existence of least generalizations and greatest specializations in each of 
these six ordered sets. The outcome of this investigation is summarized in Table 1. The only 
remaining open question is the existence or non-existence of a least generalization under 
implication in C for sets of clauses which all contain function symbols. 

Table 1 makes explicit the trade-off between different generality orders. On the one 
hand, implication is better suited as a generality order than subsumption, particularly in 
case of recursive clauses. Relative implication is still better, because it allows us to take 
background knowledge into account. On the other hand, we can see from the table that as 
far as the existence of least generalizations goes, subsumption is more attractive than logical 
implication, and logical implication is in turn more attractive than relative implication. For 
subsumption, least generalizations always exist. For logical implication, we can only prove 
the existence of least generalizations in the presence of a function-free clause. And finally, 
for relative implication, least generalizations need not even exist in a function-free language. 
In practice this means that we cannot have it all. If we choose to use a very strong generality 
order such as relative implication, least generalizations only exist in very limited cases. On 
the other hand, if we want to guarantee that least generalizations always exist, we are 
committed to the weakest generality order: subsumption.